205 research outputs found

Des modèles biologiques à l'amélioration des plantes

Author: Rivals E.
Publication venue: 'Aufklarung Journal of Philosophy'
Publication date: 01/01/2001

Superstrings with multiplicities

Author: Cazaux B.
Rivals E.
Publication venue: Schloss Dagstuhl Leibniz Center for Informatics
Publication date: 01/01/2018

A superstring of a set of words P = {s_1, …, s_p} is a string that contains each word of P as a substring. Given P, the well-known Shortest Linear Superstring problem (SLS) asks for a shortest superstring of P. In a variant of SLS, called Multi-SLS, each word s_i comes with an integer m(i), its multiplicity, that sets a constraint on its number of occurrences, and the goal is to find a shortest superstring that contains at least m(i) occurrences of s_i. Multi-SLS generalizes SLS and is obviously as hard to solve, but it has been studied only in special cases (with words of length 2 or with a fixed number of words). The approximability of Multi-SLS in the general case remains open. Here, we study the approximability of Multi-SLS and that of the companion problem Multi-SCCS, which asks for a shortest cyclic cover instead of a shortest superstring. First, we investigate the approximation of a greedy algorithm for maximizing the compression offered by a superstring or by a cyclic cover: the approximation ratio is 1/2 for Multi-SLS and 1 for Multi-SCCS. Then, we exhibit a linear time approximation algorithm, Concat-Greedy, and show it achieves a ratio of 4 regarding the superstring length. This demonstrates that for both measures Multi-SLS belongs to the class of APX problems. © 2018 Yoshifumi Sakai; licensed under Creative Commons License CC-BY. Peer reviewed
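The multiplicity constraint described in this abstract can be made concrete with a small checker. The following is only a minimal sketch in Python, not the Concat-Greedy algorithm from the paper; the function names `count_occurrences` and `is_multi_superstring` are hypothetical, and occurrences are assumed to be allowed to overlap.

```python
from typing import Mapping


def count_occurrences(text: str, word: str) -> int:
    """Count occurrences of word in text, allowing overlaps (an assumption)."""
    count = 0
    start = text.find(word)
    while start != -1:
        count += 1
        # Restart one position later so overlapping occurrences are counted.
        start = text.find(word, start + 1)
    return count


def is_multi_superstring(candidate: str, multiplicities: Mapping[str, int]) -> bool:
    """Check that candidate contains at least m(i) occurrences of each word s_i."""
    return all(
        count_occurrences(candidate, word) >= m
        for word, m in multiplicities.items()
    )


if __name__ == "__main__":
    # Hypothetical instance: "ab" must occur twice, "ba" once.
    instance = {"ab": 2, "ba": 1}
    print(is_multi_superstring("abab", instance))
    print(is_multi_superstring("aba", instance))
```

A checker like this only verifies feasibility of a candidate; it says nothing about whether the candidate is shortest.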

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Helsingin yliopiston digitaalinen arkisto

Linking BWT and XBW via Aho-Corasick automaton : Applications to run-length encoding

Author: Cazaux B.
Rivals E.
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 25/05/2018

Peer reviewed

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Helsingin yliopiston digitaalinen arkisto

Detection of recombination in variable number tandem repeat sequences

Author: Adebiyi E. F.
Rivals Eric
Publication date: 01/01/2007

Tandem repeats are repeated sequences whose copies are adjacent along the chromosomes. They account for a large portion of eukaryotic genomes and are found in all types of living organisms. Among tandem repeats, those with a repeat unit of middle size are called minisatellites. These loci depart from classical loci because of their propensity to vary in size due to the addition or removal of one or more repeat units. Due to this polymorphism, they prove useful in genetic mapping, in population genetics, and in forensic medicine. Moreover, some specific tandem repeat loci are involved in diseases, like the insulin minisatellite, which is implicated in type I diabetes and obesity. Those loci also undergo complex recombination events. Presently, some programs to compare tandem repeat alleles exist and yield good results when recombination is absent, but none correctly handles recombinant alleles. Our goal is to develop an adequate tool for the detection of recombinants among a set of minisatellite sequences. By combining a multiple alignment tool and a method based on phylogenetic profiling, we design a first solution, called MS_PhylPro, for this task. The method has been implemented, tested on real data sets from the insulin minisatellite, and proven to detect recombinant alleles.

Covenant University Repository

Convergence of the number of period sets in strings

Author: Rivals E. (Eric)
Sweering M.J.M. (Michelle)
Wang P.
Publication date: 02/05/2022

Consider words of length n. The set of all periods of a word of length n is a subset of {0,1,2,…,n−1}. However, not every subset of {0,1,2,…,n−1} is a valid set of periods. In a seminal paper in 1981, Guibas and Odlyzko proposed to encode the set of periods of a word into a binary string of length n, called an autocorrelation, where a one at position i denotes a period of i. They considered the question of recognizing a valid period set, and also studied the number of valid period sets for length n, denoted κ_n. They conjectured that ln(κ_n) asymptotically converges to a constant times ln^2(n). Although improved lower bounds for ln(κ_n)/ln^2(n) were proposed in 2001, the question of a tight upper bound has remained open since Guibas and Odlyzko's paper. Here, we exhibit an upper bound for this fraction, which implies its convergence and closes this long-standing conjecture. Moreover, we extend our result to find similar bounds for the number of correlations: a generalization of autocorrelations which encodes the overlaps between two strings.
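The autocorrelation encoding described in this abstract can be illustrated with a brute-force sketch in Python. This is not the method of the paper: the function names `autocorrelation` and `count_period_sets` are hypothetical, the enumeration assumes a binary alphabet, and it is only practical for very small n.

```python
from itertools import product


def autocorrelation(word: str) -> str:
    """Return the binary string whose bit at position p is 1 iff p is a period.

    p is a period of word when the suffix starting at p equals the prefix
    of length len(word) - p; position 0 is always a period.
    """
    n = len(word)
    return "".join(
        "1" if word[p:] == word[: n - p] else "0" for p in range(n)
    )


def count_period_sets(n: int, alphabet: str = "ab") -> int:
    """Count distinct autocorrelations of words of length n by enumeration."""
    return len(
        {autocorrelation("".join(letters)) for letters in product(alphabet, repeat=n)}
    )


if __name__ == "__main__":
    print(autocorrelation("abaaba"))  # periods 0, 3 and 5 -> "100101"
    print([count_period_sets(n) for n in range(1, 5)])  # [1, 2, 3, 4]
```

The enumeration visits every word of length n, so its cost grows exponentially with n; it serves only to make the definition of κ_n tangible.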

CWI's Institutional Repository

INRIA a CCSD electronic archive server

HAL Descartes

Dagstuhl Research Online Publication Server

Hal-Diderot

CpG-ODN-induced sustained expression of BTLA mediating selective inhibition of human B cells.

Author: Derré L.
Gertner-Dardenne J.
Mamessier E.
Olive D.
Pastor S.
Rivals J.P.
Speiser D.E.
Thibult M.L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

BTLA (B- and T-lymphocyte attenuator) is a prominent co-receptor that is structurally and functionally related to CTLA-4 and PD-1. In T cells, BTLA inhibits TCR-mediated activation. In B cells, the roles and functions of BTLA are still poorly understood and have never been studied in the context of B cells activated by CpG via TLR9. In this study, we evaluated the expression of BTLA depending on activation and differentiation of human B cell subsets in peripheral blood and lymph nodes. Stimulation with CpG upregulated BTLA, but not its ligand, herpes virus entry mediator (HVEM), on B cells in vitro and sustained its expression in vivo in melanoma patients after vaccination. Upon ligation with HVEM, BTLA inhibited CpG-mediated B cell functions (proliferation, cytokine production, and upregulation of co-stimulatory molecules), which was reversed by blocking BTLA/HVEM interactions. Interestingly, chemokine secretion (IL-8 and MIP1β) was not affected by BTLA/HVEM ligation, suggesting that BTLA-mediated inhibition is selective for some but not all B cell functions. We conclude that BTLA is an important immune checkpoint for B cells, as is already known for T cells.

HAL AMU

Serveur académique lausannois

HAL Descartes

ProbCD: enrichment analysis accounting for categorization uncertainty

Author: A Lewin
A Vinayagam
B Engelhardt
C Andersson
C Jones
D Martin
E Levy
I Rivals
Ilya Shmulevich
J Goeman
L Goodman
M Aubry
P Shannon
R Fisher
R Sealfon
R Vencio
Ricardo ZN Vêncio
S Carroll
S Maere
T Joshi
W Zhang
W Zhang
Z Jiang
Publication date: 01/01/2007

As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed an open-source R package to deal with probabilistic categorical data analysis, ProbCD, that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning the enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of the high-throughput experimental techniques and (ii) probabilistic gene annotation.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Nature Precedings

Neural Modeling and Control of Diesel Engine with Pollution Constraints

Author: D. Psaltis
D. T. Pham
E. S. Plumer
F. C. Chen
G. Bloch
Gérard Bloch
I. Rivals
J. P. Vila
K. J. Hunt
K. J. Hunt
K. S. Narendra
K. S. Narendra
L. Ljung
M. Blanke
M. Nørgaard
Mustapha Ouladsine
S. Chen
Xavier Dovifaaz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

The paper describes a neural approach for modelling and control of a turbocharged Diesel engine. A neural model, whose structure is mainly based on some physical equations describing the engine behaviour, is built for the rotation speed and the exhaust gas opacity. The model is composed of three interconnected neural submodels, each of them constituting a nonlinear multi-input single-output error model. The structural identification and the parameter estimation from data gathered on a real engine are described. The neural direct model is then used to determine a neural controller of the engine, in a specialized training scheme minimising a multivariable criterion. Simulations show the effect of the pollution constraint weighting on trajectory tracking of the engine speed. Neural networks, which are flexible and parsimonious nonlinear black-box models with universal approximation capabilities, can accurately describe or control complex nonlinear systems with little a priori theoretical knowledge. The presented work extends optimal neuro-control to the multivariable case and shows the flexibility of neural optimisers. Considering the preliminary results, it appears that neural networks can be used as embedded models for engine control, to satisfy increasingly restrictive pollutant emission legislation. In particular, they are able to model nonlinear dynamics and, during transients, outperform control schemes based on static mappings. Comment: 15 pages

arXiv.org e-Print Archive

Crossref

HAL AMU

Detecting microsatellites within genomes: significant variation among algorithms

Author: A Benet
A Goffeau
A Hauth
A Smit
AT Castelo
B Harr
C Abajian
D Dieringer
D Falush
D Goldstein
E Coward
E Rivals
E Rivals
E Rivals
ER Moxon
Eric Rivals
G Benson
G Chambers
GI Bell
GM Landau
H Ellegren
I Arzimanoglou
IHGS Consortium
J Jurka
J Jurka
J Majewski
J Taylor
JE Galagan
L Jin
M Adams
M Katti
M Kayser
M Mitas
M Morgante
MT Webster
O Delgrange
O Rose
P Calabrese
P Jarne
P Martin
Philippe Jarne
R Kolpakov
R Kolpakov
R Sainudiin
R Sokal
S Kruglyak
S Kruglyak
Sébastien Leclercq
T Pupko
TD Petes
TF Smith
V Fischetti
Y Wexler
YL Lai
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results: Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion: Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
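The sensitivity to user-defined parameters mentioned in this abstract can be illustrated with a toy detector for perfect microsatellites. This is only a sketch in Python and does not reproduce TRF, Mreps, Sputnik, STAR, or RepeatMasker; the function name `find_perfect_microsatellites` and the parameters `min_copies` and `max_motif` are hypothetical, and imperfect repeats are ignored entirely.

```python
import re
from typing import List, Tuple


def find_perfect_microsatellites(
    sequence: str, min_copies: int = 3, max_motif: int = 6
) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for perfect tandem repeats in sequence.

    The motif is the shortest unit of 1..max_motif bases found at a position,
    and it must be repeated at least min_copies times in a row.
    """
    # Lazy group picks the shortest motif; \1 requires further exact copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    toy = "TTACACACACGG"  # hypothetical toy sequence
    print(find_perfect_microsatellites(toy, min_copies=3))  # [(2, 'AC', 4)]
    print(find_perfect_microsatellites(toy, min_copies=5))  # []
```

Raising `min_copies` from 3 to 5 makes the same locus disappear from the output, which is a small-scale version of how threshold choices can reshape the reported distribution of short microsatellites.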

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

String Matching and 1d Lattice Gases

Author: A. D. Barbour
A. Dembo
B. Prum
D. Achlioptas
D. E. Knuth
E. Rivals
F. Gürsey
G. E. Uhlenbeck
G. Reinert
H. Harborth
H. S. Wilf
I. Fudos
I. Z. Fisher
J. Kleffe
Jane F. Gentleman
L. Goldstein
L. J. Guibas
L. J. Guibas
L. J. Guibas
M. Mézard
M. Régnier
M. Régnier
M. S. Waterman
M. X. Geske
Muhittin Mungan
O. Chrysaphinou
O. Chrysaphinou
O. Chrysaphinou
P. Pevzner
R. Monasson
S. B. Boyer
S. Karlin
S. Kirkpatrick
S. Robin
S. Robin
S. Robin
S. Schbath
W. Feller
Y. Fu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/08/2005

We calculate the probability distributions for the number of occurrences $n$ of a given $l$-letter word in a random string of $k$ letters. Analytical expressions for the distribution are known for the asymptotic regimes (i) $k \gg r^l \gg 1$ (Gaussian) and (ii) $k, l \to \infty$ such that $k/r^l$ is finite (Compound Poisson). However, it is known that these distributions do not work well in the intermediate regime $k \gtrsim r^l \gtrsim 1$. We show that the problem of calculating the string matching probability can be cast as determining the configurational partition function of a 1d lattice gas with interacting particles, so that the matching probability becomes the grand-partition sum of the lattice gas, with the number of particles corresponding to the number of matches. We perform a virial expansion of the effective equation of state and obtain the probability distribution. Our result reproduces the behavior of the distribution in all regimes. We are also able to show analytically how the limiting distributions arise. Our analysis builds on the fact that the effective interactions between the particles consist of a relatively strong core of size $l$, the word length, followed by a weak, exponentially decaying tail. We find that the asymptotic regimes correspond to the case where the tail of the interactions can be neglected, while in the intermediate regime the tail needs to be kept in the analysis. Our results are readily generalized to the case where the random strings are generated by more complicated stochastic processes, such as a non-uniform letter probability distribution or Markov chains. We show that in these cases the tails of the effective interactions can be made even more dominant, thus rendering the asymptotic approximations less accurate in such a regime. Comment: 44 pages and 8 figures. Major revision of previous version. The lattice gas analogy has been worked out in full, including virial expansion and equation of state. This constitutes the main part of the paper now. Connections with existing work are made and references should be up to date now. To be submitted for publication
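The quantity studied in this abstract, the distribution of the number of occurrences of an l-letter word in a random string of k letters, can be estimated numerically. The following is a brute-force Monte Carlo sketch in Python, not the lattice gas calculation of the paper; the function names are hypothetical, letters are assumed to be independent and uniform over an alphabet of size r, and overlapping occurrences are counted.

```python
import random
from collections import Counter
from typing import Dict


def count_matches(text: str, word: str) -> int:
    """Count occurrences of word in text, allowing overlaps."""
    count = 0
    start = text.find(word)
    while start != -1:
        count += 1
        start = text.find(word, start + 1)
    return count


def expected_matches(k: int, l: int, r: int) -> float:
    """Mean number of matches for i.i.d. uniform letters: (k - l + 1) / r**l."""
    return (k - l + 1) / r**l


def empirical_distribution(
    word: str, k: int, alphabet: str, trials: int, seed: int = 0
) -> Dict[int, float]:
    """Estimate P(n matches) by sampling random strings of length k."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(trials):
        text = "".join(rng.choice(alphabet) for _ in range(k))
        counts[count_matches(text, word)] += 1
    return {n: c / trials for n, c in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical intermediate-regime example: k = 20, l = 3, r = 2.
    dist = empirical_distribution("aba", k=20, alphabet="ab", trials=10000)
    mean = sum(n * p for n, p in dist.items())
    print(dist)
    print(mean, expected_matches(20, 3, 2))
```

The sample mean can be compared with (k − l + 1)/r^l, which holds for any word under the uniform assumption; the shape of the distribution, by contrast, depends on how the word overlaps with itself.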

arXiv.org e-Print Archive

Crossref